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I.  INTRODUCTION 

The  research  reported  in  this  paper  was  motivated  by  the  following 
problem  in  the  mapping  of  two-dimensional  random  fields,  that  is  spatially 
distributed  random  quantities .  Measurements  or  surveys  of  the  field  are 
collected  at  different  times  along  sets  of  one-dimensional  tracks  across 
the  field.  Ihe  sets  of  tracks  may  differ  from  survey  to  survey.  Either 
a  local  or  regional  map  is  generated  for  each  of  these  surveys  and  the 
problem  is  either  to  combine  these  local  maps  optimally,  or  to 
update  an  overall  map  as  each  new  survey  becomes  available.  Problems  of 
this  type  arise  in  many  applications  including  the  mapping  of  vertical  tem¬ 
perature  profiles  of  the  atmosphere  given  data  provided  by  satellites  [1] 
and  the  mapping  of  anomalies  in  the  earth's  gravitational  field  and  the 
effects  of  such  anomalies  on  errors  in  inertial  navigation  systems  [2,14]. 

■Hie  problem  posed  in  the  preceding  paragraph  is  not  solved  completely 
in  this  paper,  but  a  special  case  of  it  is  in  which  the  tracks  are  all 
parallel  and  the  field  along  the  direction  of  the  tracks  cam  be  modeled  by 
a  finite  dimensional  linear  shaping  filter  driven  by  white  noise.  In  ad¬ 
dition  to  solving  this  special  case  amd  to  providing  insight  into  the 
general  case,  the  solution  we  obtain  is  of  independent  interest  in  that  it 
provides  a  procedure  for  optimally  updating  smoothed  estimates  as  more  data 
is  collected.  Furthermore,  one  of  the  principle  steps  in  our  development  is 
the  construction  of  the  optimal  combined  filtered  (i.e.  causal)  estimate 
from  several  local  filtered  estimates,  this  is  basically  a  problem  in 
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decentralized  filtering,  and  our  results  extend  those  of  Speyer  [3]  and 
Chong  [ 7] . 

In  the  next  section  we  present  and  discuss  the  solution  to  the  problem 
of  combining  decentralized  filtered  estimates,  while  Sections  III  contains 
the  description  and  solution  of  the  problem  of  updating  smoothed  estimates. 
In  Section  IV  we  apply  the  results  of  the  preceding  section  to  the  problem 
of  real-time  smoothing,  that  is,  of  estimation  given  a  previous  smoothed 
estimate  and  new  real-time  data.  The  paper  concludes  with  a  discussion  in 
Section  V. 


II.  COMBINING  DECENTRALIZED  FILTERED  ESTIMATES 
2.1  Formulation  and  Solution  of  the  General  Case 

Consider  a  linear  dynamical  system  driven  by  Gaussian  white  noise 

x (t)  P  A(t)x(t)  +  w(t)  (2.1) 

ElwftJwd)']  =  Q(t)«S(t-T)  (2.2) 

where  w(t)  is  independent  of  x(0)  which  is  taken  to  be  a  zero  mean 
Gaussian  random  variable  with  covariance  £  (0) .  Suppose  we  have  two  sets 
of  white  noise-corrupted  observations 

y^(t)  =  C^ (t)x(t)  +  v^(t)  (2.3) 

y2(t)  =  C2(t)x(t)  +  v2(t)  (2.4) 

where  v1  and  v2  are  independent  of  each  other  and  of  w  and  x(0),  with 


E[v^ (t)v^ (t) ' ]  =  Ri(t)6(t-T)  ,  i=l,2 


(2.5) 
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Hie  two  sets  of  measurements  (2.3), (2.4)  can  be  thought  of  as  representing 
observations  taken  at  different  locations  in  a  distributed  system  or  at 
different  nodes  in  a  network  of  interconnected  systems.  These  observations 
are  processed  separately  to  produce  local  filtered  estimates,  and  we  wish 
to  consider  the  problem  of  recovering  the  overall  optimal  filtered  estimate 

x(t|t)  =  E{x(t) |y1(s) ,y2(s) ,  s<t}  (2.6) 

in  terms  of  these  local  estimates.  If  this  can  be  done,  then  much  of  the 
raw  data  processing  can  be  done  locally  without  any  loss  in  global  per¬ 
formance.  In  addition,  if  local  filtering  is  performed  on  the  data,  we  may 
reduce  the  required  bandwidth  for  transmission  of  information  to  a  cen¬ 
tralized  processor.  A  problem  of  this  type  was  considered  by  Speyer  [3] 
in  the  context  of  decentralized  control.  Our  work  represents  an  extension 
of  the  estimation  portion  of  his  results.  Also,  while  we  consider  only 
two  sets  of  measurements  (2. 3),  (2. 4),  the  preceding  formulation  and  our 
analysis  of  it  extend  in  an  obvious  manner  to  the  case  of  N  sets  of 
measurements  and  local  estimates. 

In  order  to  complete  the  formulation  of  the  problem,  we  assume  that 
the  local  processing  algorithms  are  Kalman  filters  based  on  different 
models : 


X.  (t)  =  A. (t)x. (t)  +  w. (t)  , 

1  IX  X 

i=l,2 

(2.7) 

E[w^  (t)w^ (T) * ]  =  (t)6(t-t) , 

i-1,2 

(2.8) 

y. (t)  =  H. (t)x. (t)  +  v, (t), 
x  li  x 

i=l,2 

(2.9) 

where  x^(0)  is  taken  to  be  zero-mean  with 

covariance  £ ^  (0) . 

It  is 
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important  to  emphasize  here  that  (2.7)- (2.9)  represents  a  model  whose 
sole  purpose  is  for  the  design  of  local  Kalman  filters.  This  model  may 
not  accurately  reflect  the  actual  statistics  of  y^.  At  the  moment  we  are 
assuming  no  relationship  between  the  local  model  (2.7)- (2 .9)  and  the 
correct  global  model  (2.1) -(2. 5),  except  for  the  assumption  that  the  v^ 
in  (2.9)  are  the  same  as  in  (2.5)  (i.e.  that  they  have  the  same  statistics, 
so  at  least  the  measurement  noise  is  modeled  in  the  same  fashion  locally 
and  globally) .  As  we  need  to  impose  some  relationship  between  local  and 
global  models,  we  will  do  so. 

Given  these  local  models ,  the  equation  for  each  local  processor  is 
given  by  the  following,* 


x.  (tit)  =  [A.-P.h’.rTV  ]x.  (tit)  +  P.HJrTV  (t)  (2.10) 

i*  xxxxxx1  i  i.  i 

The  covariance  P^  can  be  precomputed  from  either  of  the  following 

equations ; 


r  x 


+  p.a!  +  Q.  -  p.hIrTVp. 

IX  X1  11111 

(2.11) 

-1  ■  -1  -1  -1 

i  -1 

(2.12) 

-P.  A.  -  A. P .  -  P.  Q.p. 

11  11  1  *1  1 

+  H.R.  H. 

11  1 

with  the  initial  condition 

Pi(0)  =  1.(0) 


(2.13) 


The  problem  to  be  solved  is  to  obtain  an  algorithm  for  computing 


*  From  this  point  of  the  explicit  time  dependence  of  matrices  will  be 
suppressed.  If  a  particular  matrix  is  constant,  we  will  explicitly 
state  this  in  order  to  avoid  confusion. 
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the  global  estimate  x  in  (2.6)  in  terms  of  x^  and  x2.  Speyer  in  [3]  solved 
this  problem  when  the  local  model  are  the  same  as  the  global  model,  and  we 
will  comment  on  the  simplifications  that  occur  in  that  case  shortly. 

Allowing  the  local  models  to  differ  from  the  global  model  leads  to  several 
potential  advantages.  For  example,  presumably  the  local  models  are  lower¬ 
dimensional  than  (2.1)  and  represent  the  important  dynamics  at  that  particular 
location  in  the  distributed  system  or  network.  Therefore,  the  local 
processors  can  be  made  far  less  complex  than  the  global  processor.  Of  course, 

AAA 

we  cannot  recover  x  from  x^  and  x2  for  arbitrary  choices  of  local  models, 

but  the  conditions  needed  are  quite  weak.  Specifically,  as  we  will  see,  the 

only  condition  that  is  required  is  that  there  exist  (possibly-varying) 

matrices  M,  and  M„  such  that 
1  2 

CL  =  HiMi  ,  i-1,2  (2.14) 

Equation  (2.14)  and  its  implications  deserve  some  comment.  First 
note  that  (2.14)  is  equivalent  to 

R(Ci)  C  R(Hi)  ,  1*1,2  (2.15) 

or  equivalently  that 

R(Ci)X  3  ROiJ1  ,  i-1,2  (2.16) 

What  these  conditions  say  is  that  if  any  set  of  components  of  H^x^  are 
linearly  interrelated,  then  the  same  set  of  components  of  C^x  must  have 
exactly  the  same  linear  interrelationship.  That  is,  if  the  local  models 


(2.7)-(2.9)  assume  any  redundancy  among  the  sensed  quantities  —  i.e.  the 
components  of  y^  —  then  that  redundancy  must  actually  exist  in  the  global 
model.  Note  that  if  (2.15)  is  satisfied,  valid  choices  for  and  M 2  are 

Mi  =  HiCi  '  i=1'2  (2'17) 

(where  "+"  denotes  pseudo- inverse)  and  the  choice  is  unique  only  if 
N(HJ  =  {0}. 

Thus,  the  dynamics  (2. 7),  (2. 8)  can  be  totally  arbitrary,  as  long  as 
(2.15)  or  (2.16)  is  satisfied.  For  example,  one  implication  of  this 
condition  is  that  the  dimension  of  x^  must  be  at  least  as  large  as  the 
number  of  linearly  independent  components  of  the  measurement  vector  y^. 

However,  the  condition  (2.15)  is  sufficiently  weak  that,  if  we  desire,  we 
can  always  choose  a  local  model  of  this  minimal  dimension  that  satisfies 
the  condition.  Therefore,  the  conditions  does  not  require  that  there  by 
any  physical  relationship  between  the  local  states ,  x^  and  x^ ,  and  the 
global  state  x.  On  the  other  hand,  (2.14)  suggests  an  interpretation  of 
x^  as  being  a  part  of  the  global  state,  specifically  M^x.  If  this  is  the 
case,  then  (2.7)  implies  that  this  part  of  the  state  is  decoupled  from  the 
remaining  part  of  x  in  the  sense  that  J^x  is  itself  a  Markov  process.  This 
is,  of  course,  not  the  usual  case  in  practice,  where  approximations  are  made 
in  assuming  that  the  couplings  between  the  local  states  can  be  neglected  or 
cam  be  replaced  by  additional  white  noise  sources.  What  our  results  say  is 

A 

that  as  long  as  (2.14)  holds,  for  the  purposes  of  reconstructing  x,  it  doesn't 

i 
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matter  if  (2.7)  is  an  exact  or  approximate  expression  for  the  evolution 
of  the  local  state.  If  x^  actually  equals  M^x,  we  obtain  some  simplifi¬ 
cations  in  the  equations  that  define  our  algorithm,  and  we  will  discuss 
these  at  the  end  of  this  section. 

As  a  first  step  in  deriving  our  algorithm,  consider  the  Kalman  filter 
for  the  calculation  of  the  global  estimate  x: 

x (t  1 1 )  =  tA-PCp*~1C1-PC^R"1C2]x(t|t)  +  PC^1yL(t)  +  PC2R21y2(t)  (2.18) 
where  P  can  be  calculated  from 

P  =  AP  +  PA'  +  Q  -  PC^r'^P  -  PC^r”1^  (2.19) 

P(0)  =  1(0)  (2.20) 

The  solution  to  the  problem  we  have  posed  can  be  obtained  as  follows . 
Rearranging  (2.10)  we  have* 


hIrTV  =  pT1{x.-[A.-P.h'r71H.  ]x.  } 


(2.21) 


Examining  (2.18),  we  see  that  the  quantities  needed  in  the  calculation  of 

A  I  “  2.  I  ”  1 

x  are  C^R^  y^  and  C2R2  These  can  be  obtained  from  (2.21)  only  if 

matrices  and  M2  exist  that  satisfy  (2.14).  Assuming  that  this  is  the 
case,  we  can  combine  (2 .14) ,  (2 .18) ,  and  (2.21)  to  obtain 


*  Note  that  we  have  implicitly  made  one  other  assumption  about  the  local 
models,  in  that  in  (2.21)  we  are  assuming  that  P^  is  invertible.  This  will 

be  guaranteed  as  long  as  (0)  is  invertible 
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x  =  [A-PC^R"1^  - 


+  PM^P^^-tA^P^'R"1^]^} 

+  ™2P2l(*2  -‘W^vy 


(2.22) 


In  order  to  simplify  notation,  define  the  following  quantities: 


F  =  A-PC'r"1^  -  PC’r'V, 

111  222 


F.  =  A.-P.hIrTV 

1  111X1 


G.  =  PM.'pT1 

1  li 


i=l,2 


(2.23) 

(2.24) 

(2.25) 


Then,  in  order  to  avoid  differentiating  x^  and  x2  in  (2.22),  we  define 


5  -  x  -  =A  -  g2x2 


(2.26) 


and  differentiating,  we  find  that 


5  =  F£  +  +  K2x2 


X  =  5  +  Vi  +  G2X2 

where 


(2.27) 

(2.28) 


K.  =  FG.  -  G.  -  G.F. 

l  1  1  11 


,  i-1,2 


(2.29) 


-1  * 

If  we  use  the  differential  equations  for  P^  P  and  P,  (2.29)  becomes 


*  Note  that  in  (2.29)  we  have  implicitly  assumed  that  and  M2  are  dif¬ 
ferentiable.  Again  this  is  not  a  particularly  restrictive  condition.  For 
example,  in  the  time-invariant  case  it  is  certainly  true,  since  M  and  M 
can  be  taken  to  be  constants.  1  2 
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K. 

1 


=  [pmIptVp:1  -  qm:?:1] 


Ill  X 


1  1 


+  [pmIaIpT1  -  pa'mjpT1 

ill  ii 


PKjp'1] , 


i=l,2 


(2.30) 


If  all  of  the  models,  local  and  global,  are  time- invariant  and  if  we 
consider  the  steady- state  case,  then  the  above  solution  still  applies 
(with  M.=0)  and  is  also  time- invariant. 

l 

This  is  the  general  solution  to  the  problem  of  combining  decentralized 
maps.  In  addition,  this  solution  can  be  directly  adapted  to  the  problem 

A  A 

of  computing  x  from  x^  and  y^ .  This  is  of  interest  in  situations  in  which 
one  local  processor  transmits  information  to  a  global  processor  that  has 
measurements  of  its  own.  We  can  solve  this  problem  by  returning  to 

(2.18),  and  instead  of  replacing  both  c'^R^y^  and  C2R2^y2  expressions 

.  A  A  .  —1 
m  terms  of  x.  and  x. ,  we  make  this  substitution  only  for  C,R.  y, .  The 
ii  111 

remaining  analysis  is  analogous  to  that  carried  out  previously,  and 
the  result  is 


where 


x  =  p  +  G1xi 


p  =  Fp  +  K1xi  +  PC;R-1y2 


(2.31) 


(2.32) 


Here  F,  K^,  and  G^  are  the  same  as  given  previously. 

In  the  next  two  subsections  we  present  two  special  cases  which 
result  in  some  simplifications  in  (2 . 23) -  (2 .32)  and  consequently  allow 
us  to  interpret  our  result  in  more  detail. 
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2 . 2  The  Special  Case  of  Identical  Local  and  Global  Models 

In  this  section  we  consider  the  case  examined  by  Speyer  in  [3] . 

Specifically,  we  assume  that  the  models  used  by  the  local  processors  are 
identical  to  the  global  model.  Ihat  is, 

A  =A  =A,  Q=Q=Q,  C  =H  ,  C  =H  ,  M  =M  =1  (2.33) 

X  &  X  X  X  X  X  4  X  X 

In  this  case  the  expressions  for  and  K2  simplify  to 

K.  =  PpTVT1  -  QpT1  =  (PP^-DqpT1  (2.34) 

1  11  1  1  *1 

and 

x  =  C  +  Pfp"1^  +  p"1^)  (2.35) 

/v 

Note  that  the  second  term  in  the  expression  for  x  is  the  usual  expression 

for  combining  independent  estimates  [4,5] .  However  x^  and  x^  are  not  independent 

in  general,  and  £  represents  a  correction  for  this  correlation. 

A  A 

The  reason  that  x^  and  x2  are  not  independent  estimates  is  that  they 
are  based  not  only  on  measurements  with  independent  noises  but  also  on  a 
priori  information.  Specifically,  both  of  the  local  estimates  incorporate 
statistical  descriptions  of  x(0)  and  w(t),  and  thus  the  errors  in  both 
estimates  are  correlated  with  these  processes .  It  is  the  correlation  with 
the  process  w(t)  that  leads  to  the  need  for  a  dynamical  correction  (£)  to 

A  A 

account  for  the  correlation  in  the  processes  x^  and  If  Q=0  (i.e.  if 

A 

w(t)  is  not  present),  then  K^Oand  hence  £=0,  and  x  is  a  memoryless  function 

A  A 

of  x  and  x_.  In  this  case  it  is  straightforward  to  show  that 
1  * 


and 


(2.36) 


x(t|t)  +  P(t)  IP”1  tt)^  (t  |t)  +  p'|t)x2  (t|t)] 

P  1(t)  =  p"1^)  +  p"1(t)  -  (2.37) 

where  £(t)  is  the  unconditional  covariance  of  x(t).  In  general  £(t) 
satisfies 

I (t)  =  A(t)J(t)  +  £(t)A'(t)  +  Q (t)  (2.38) 

with  £(0)  given.  Equations  (2.36)  and  (2.37)  hold  only  in  the  case  when 

Q  is  zero.  Note  that  even  in  this  case  x^  and  x2  are  not  independent 

estimates  because  of  the  correlation  of  the  estimation  errors  with  x(0). 

Following  the  work  of  Wall  [4],  we  can  interpret  (2.36)  and  (2.37)  as 

follows.  We  have  three  sources  of  information  on  which  to  base  our 

estimate  of  x(t),  the  measurement  processes  y^  and  y^  and  the  a  priori 

information  about  x(t),  provided  by  the  unconditional  propagation  of  the 

mean  and  variance  from  the  specified  statistics  of  x(0).  The  estimate  x^ 

uses  y^  and  the  a  priori  information,  which,  therefore  is  used  twice. 

Equation  (2.37)  corrects  for  the  fact  that  both  P, ^  and  P„^  reflect  the 

1  2 

uses  of  this  information.  Also,  (2.36)  is  the  correct  expression  under  the 
assumption  that  x(0)  is  zero  mean.  If  this  is  not  the  case,  that  is  if  its 
mean  m(0)^0,  then  (2.36)  is  replaced  by 

x(t|t)  =  P(t)  [P'^ttlt)  +  P"1^  (t|t)  -  £_1  (t)m  (t)  ] 


(2.39) 
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where  m(t)  is  the  unconditional  mean  of  x(t)  which  satisfies 

m(t)  =  Am(t)  (2.40) 

Again  we  see  the  "subtracting  out"  of  the  effect  of  a  priori  information, 
so  that  the  duplication  of  this  information  is  removed. 

Finally,  note  that  K^=0  also  if  P=P^.  However,  this  is  only  the 
case  if  the  other  set  of  measurements  contains  no  information.  In  general, 
if  the  system  is  observable  from  each  set  of  measurements,  (pp.^-l)  will 
be  invertible.  Of  course,  all  of  the  previous  statements  have  certain 
obvious  generalizations.  For  example,  if  part  of  the  state  is  uncontrol¬ 
lable  from  the  noise,  then  the  corresponding  part  of  x  is  a  memory less 
•  A  ^ 

function  of  x^  and  x^.  Also,  if  one  set  of  measurements,  say  set  1, 
contains  no  information  about  a  part  of  x,  then  the  corresponding  parts 
of  P  and  P2  are  identical. 

2 . 3  The  Case  in  Which  the  Local  Model  is  a  Subsystem  of 
the  Global  Model 

In  some  cases  the  dynamics  of  one  of  the  local  models  may,  in  fact, 
be  the  exact  dynamics  of  a  subsystem  of  the  global  model.  Specifically, 
if  this  is  true  of  local  model  1,  then 

x.(t)  =  (2.41) 

Equation  (2.41)  has  several  important  implications.  Since  x^  satisfies 
(2. 7), (2. 8),  and  x  satisfies  (2.1),  (2.2),  equation  (2.41)  states  that 
the  Markov  process  x(t)  has  a  subprocess,  namely  x^t),  that  is  Markov 
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by  itself.  Differentiating  (2.41)  and  using  (2.1)  and  (2.7)  we  have  that 


A1M1X+W1  =  A1X1+W1  =  X1  =  M1X  +  M1X  =  MlX+MlAx  +  M1W 


(2.42) 


and  from  this  we  conclude  that 


and 


A1M1  =  Mi  +  MiA 


w  =  M^w 


which  implies  that 


Q1  =  M1QM1 


(2.43) 


(2.44) 


(2.45) 


Also,  directly  from  (2.41)  we  have  that 


ei  ■ 


(2.46) 


Note  thit  from  (2.46)  it  is  clear  that  E^  is  invertible  only  if  is 

onto  (assuming  that  E  is  invertible).  We  will  assume  that  this  is  the  case> 

since  from  (2.41)  we  see  that  any  other  choice  for  leads  to  an  x^  with  fewer 

degrees  of  freedom  than  it  has  components.  In  addition,  under  these 
conditions,  the  expression  for  simplifies: 


(2.47) 


this  equation  bears  some  resemblance  to  the  form  of  the  gain  when  the 


local  model  is  the  same  as  the  global  model .  In  order  to  gain  further 
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insight,  we  wish  to  consider  a  particularly  convenient  fora  for  the 
global  model.  This  is  done  by  choosing  a  basis  for  the  global  state 
space  so  that  the  components  of  x^  are  the  first  components  of  x. 

Assuming  without  loss  of  generality  that  the  global  model  is  in  this  fora, 
then 


M  =  (X  I  0) 
X  • 


(2.49) 


(2.50) 


(2.51) 


(2.32) 


(2.53) 


This  fora  is  illustrated  in  Figure  2.1.  Note  from  the  figure  that  it  is 
clear  that  the  global  system  is  not  observable  from  y^  alone.  This  is 
not  surprising  given  that  x^  is  Markov  by  itself. 


Using  (2.48)- (2.53) ,  equation  (2.47)  becomes 


where 


P  = 


/l(p>n  Px  -11C1P1 

\kp>;2  pj1-  s^ip;1 

/<  P>11  (P)12 

\(P)12  (P)22 


(2.54) 


(2.55) 


From  this  and  the  previous  equations  and  from  the  figure  we  can  get  a 
clearer  picture  of  the  structure  of  our  solution  in  this  case*.  Since 
is  partitioned,  let  us  consider  each  part , individually .  The  first  piece, 
t  (P)  is  exactly  of  the  form  of  the  gain  that  we  saw  in  the 

preceding  subsection  when  the  local  and  global  models  are  identical,  (see 
equation  (2.34)).  This  is  not  surprising,  as  the  first  piece  of  the 
global  state  x  is  nothing  more  than  x^,  for  which  the  local  and  global 
models  agree.  Therefore,  the  incorporation  of  into  a  global 

estimate  of  this  piece  of  x,  given  and  y^,  is  the  same  as 


*  In  the  following  discussion  we  use  the  notation  developed  previously. 

Thus  x^  refers  to  the  local  estimate  of  x^  given  y^  (P^  is  its  locally-computed 

covariance)  and  x  refers  to  the  global  estimate  of  x  given  y^  and  y^.  (global 

covariance  P) .  In  the  particular  esse  being  examined  here  x  =  /  x^  \  ,  and 

\v/  M 

therefore  there  is  some  chance  of  confusion .  We  have  attempted  to  reduce  this 
chance  by  using  x.  and  x  only  in  the  senses  described  above.  Also,  we  have 
denoted  the  upper  left-hand  block  of  P  by  (P)^  (see  (2.55))  to  distinguish 
it  from  P-.  Here  (P)  .  is  the  estimation  error  covariance  of  x^^  given  y^ 
and  y  ,  while  P  is  tne  error  covariance  based  only  on  y^ . 


the  problem  we  considered  in  Subsection  2.2. 

The  second  piece  t  essentially  tells  us  how  to 

use  the  estimate  of  to  obtain  an  estimate  of  the  remaining  part  of 
the  state.  Consider  for  the  moment  the  case  in  which  there  is  no  second 
set  of  measurements,  that  is,  when  c2i=C22=0‘  In  this  case  we  have  a 
cascade  interconnection  of  two  systems  and  measurements  from  only  the 
first  of  these.  It  is  clear  that  under  these  conditions  (P)  ■=P^, 

which  merely  states  that  local  processor  #1  produces  the  best  filtered 
estimate  of  given  y  .  From  (2.54)  we  see  that  this  observation  is 
consistent  with  the  fact  that  the  first  part  of  is  zero.  Also,  the 
second  piece  of  becomes 

and  using  (2 .23)- (2 .28)  and  (2.54),  the  optimal  estimator  for  y  becomes 

Y  H-lsySi*!  12.57) 

"  -  *22"  +  <2-S7> 

These  equations  describe  how  the  optimal  estimate  of  the  unobservable 
part  of  a  system  can  be  constructed  from  the  optimal  estimate  of  the 
observable  part.  It  is  worth  noting  that  this  particular  special  case  is 
of  practical  importance,  for  example,  in  navigation  systems  in  which 
accelerations  are  sensed  and  in  which  velocities  and  positions  are  to  be 
estimated.  Our  result  states  that  the  acceleration  measurements  can  be 


processed  first  (locally)  to  produce  optimal  acceleration  estimates,  and 
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these  estimates  can  then  be  used  (perhaps  in  a  centralized  processor) 
to  compute  the  optimal  estimates  of  velocity  and  position.  Again  the 
transmission  of  filtered  measurements  may  be  done  more  efficiently  them 
the  transmission  of  the  raw  data,  and  the  complexity  of  the  two  processors 

A  A 

(for  x-^and  for  y)  are  each  less  than  the  complexity  of  a  global,  cen¬ 
tralized  estimator  for  x.  Such  a  procedure  may  also  be  of  value  even  if 
y2  is  present,  for  example,  if  we  do  have  velocity  or  position  sensors. 

In  this  case,  from  eq.  (2.32)  we  see  that  our  results  tell  us  how  to 
reconstruct  the  optimal  estimate  of  acceleration,  velocity  and  position 
in  terms  of  velocity  and  sensor  measurements  and  the  estimate  of  ac¬ 
celeration  obtained  by  processing  the  accelerometers  alone.  Again  there 
may  be  transmission  savings  in  transmitting  this  estimate  rather  than 
the  raw  accelerometer  data,  and,  in  addition,  there  may  be  implementation 
advantages  in  breaking  the  overall  optimal  estimator  into  smaller  pieces . 

Note  also  from  (2.47)  that  K^=0  if  Q=0.  In  fact,  from  (2.49)  and 
(2.51)  (together  with  the  fact  that  Q  must  be  zero  if  Q  is),  we  see 

that  K^=0  if  0^=0.  In  this  case,  whether  y2  is  present  or  not,  x 

/\ 

depends  on  x^  in  a  memoryless  fashion.  This  is  best  understood  by 
noting  that  with  £>^=0,  x^  is  a  time- varying  bias* 

xx(t)  =  $1(t,0)x1(0)  (2.58) 

and  it  also  produces  a  time-varying  bias  in  y 

*  Here  ^  is  the  state  transition  matrix  associated  with  A^ .  Similarly 
4>22  is  the  state  transition  matrix  for  A^. 
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Y(t)  =  <f22(t'°)Y(0)  + 


/ 


$22 (t,x)A21(T)x1(x)dt 


/ 


$22  (t,x)  [0:i]w(x)dX 


(2.59) 


The  measurements  y^  provide  information  about  the  second  term 
in  (2.59),  which  can  be  rewritten  as 


I 


$22 (t,X)A21 (X) $(X,t)dX 


(t) 


(2.60) 


Thus  the  best  estimate  of  y  given  the  measurements  y^  is  simply  a 
memory less  function  of  x^.  For  example,  if  we  do  not  have  a  second 
set  of  measurements  (C21=C22=0) ,  then  (2-28)  reduces  to 


i  — 1a 

x  =  PM^  xj_ 


(2.61) 


where  is  as  in  (2.49)  and  P  is  given  by  (2.54).  Therefore 


-  .  -1~ 

Y  =  ^ 


(2.62) 


III.  THE  SMOOTHING  UPDATE  PROBLEM 

Consider  the  formulation  described  in  the  preceding  subsection,  but 
with  the  following  additional  features:  (1)  we  observe  the  measurements 
over  the  time  interval  [0,T];  (2)  the  data  are  processed  locally  to 
produce  the  smoothed  estimates 


-20- 


xis(t)  =  E[xi(t) |yi(T) ,  0<T<T] 

based  on  their  local  models;  and  (3)  we  wish  to  compute  the  overall 
smoothed  estimate 


x  (t)  =  E[x(t)  y,  (t) ,y_  (x) ,  0<t<T] 

S  12  - 


using  only  the  smoothed  estimate  histories  x  and  x  .  A  second,  very 

XS  ^  o 

A  A 

closely  related  problem  is  that  of  computing  x^  in  terms  of  x^g  and  y^. 

As  we  discussed  for  the  filtering  problem  at  the  end  of  Section  2.1,  the 
solution  to  the  first  of  the  smoothing  problems  will  also  provide  us 
with  a  solution  for  the  latter.  Therefore  we  will  do  our  analysis  for 
the  first  problem  and  will  comment  on  the  second  problem  afterwards. 

The  motivation  for  these  questions  comes  from  problems  in  map  updating. 
Suppose  that,  as  illustrated  in  Figure  3.1,  we  are  interested  in  the 
estimation  of  a  two-dimensional  random  field  given  data  obtained  from 
parallel  tracks.  Problems  of  this  type  arise  in  the  mapping  of  gravita¬ 
tional  anomalies  given  data  obtained  along  tracks  over  the  ocean  [2 ,14 ] 
and  the  mapping  of  meteorological  variables  from  data  gathered  by 
satellites  [1] .  In  this  case  we  can  think  of  x^  as  representing  the 
variables  to  be  mapped  along  the  ith  set  of  tracks  or  over  some  portion  of 
the  field  including  these  tracks .  The  global  state  x  then  represents  the 
field  along  all  of  the  tracks  or  regions  surrounding  each  of  the  tracks. 
Note  that  our  model  allows  repeated  or  overlapping  surveys  (x^=x^.  for 
repeated  surveys ,  while  some  components  of  x^  are  the  seme  as  components 
of  Xj  if  the  surveys  overlap) . 
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As  in  the  preceding  section  we  will  assume  dynamical  models  for 
x  and  along  the  direction  of  the  tracks .  This  can  be  done  if  the 
underlying  field  has  a  particular  characterization.  Specifically,  let 
f(t,s)  denote  the  two-dimensional  random  field,  which  we  assume  to  be 
Gaussian  and,  for  simplicity,  zero  mean.  Here  t  is  the  direction  in 
which  the  tracks  are  taken  —  i.e.  we  observe  the  field  over  lines  of  the 
form  {(t,s^) | 0  <t<T }  for  several  values  of  s^.  What  we  require  is  that  the 
set  of  processes  f(t,s^)  jointly  have  a  finite-dimensional  shaping  filter 
representation.  Define  the  2-D  correlation  function 

R  (t ,  T ;  S  ,  0)  *  E[f (t.s)f' (T,0)]  (3.1) 

As  in  1-D ,  this  function  has  a  certain  symmetry  property.  Using  (3.1) 
it  is  readily  seen  that 

R(t ,  T;  S  ,  0)  =  R(T ,t; 0,s) '  (3.2) 

Thus,  if  we  specify  R(t,T;s,a)  for  all  4-triples  (t,T,s,a)  with  t>T, 
we  will  have  completely  specified  it.  Other  properties  of  f  can  be 
reflected  in  properties  of  R.  For  example  if  f  is  stationary,  then 

R(t,T,-S,G)  =  R(t-T,S-0)  (3.3) 

and  if  we  specify  R(l,o)  for  T>0  and  0  arbitrary,  we  will  have  completely 
specified  it. 

the  requirement  that  any  set  of  tracks  f(t,s^)  be  realizable  as  the 
output  of  a  finite  dimensional  shaping  filter  places  additional  restrictions 


on  R.  One  important  case  in  which  this  is  true  is  when  R  is  separable 

R(t , T ; S  ,  0)  =  R1(t,T)R2(s,a)  (3.4) 

with  R^  and  R2  square, 

R^  (t , T)  =  R^(T,t)  ,  R2(s,0)  =  R2 (a,s)  (3.5) 

and  R^  itself  being  separable 

Rx(t,T)  =  H(t)G(T)  ,  t>T  (3.6) 

It  is  not  difficult  to  see  that  (3.2)  ,  (3.4) ,  (3.5)  imply  that  R^ 

and  R2  commute  for  any  values  of  their  arguments.  The  case  (3.4)- (3.6) 

is  a  slight  generalization  of  classes  of  processes  considered  by  others. 

For  example,  if  we  had  further  assumed  a  separability  condition  for  R2 

as  in  (3.6)  and  made  the  field  stationary  (so  that  (3.6)  becomes 
FT 

R^  (t)  =  He  G,  t>0) ,  we  would  have  the  continuous-space  version  of  the 
model  considered  by  Attasi. 

While  the  restriction  of  finite  dimensionality  and  the  scenario  of 
Figure  3.1  are  quite  special,  this  problem  is  of  interest  in  the  applica¬ 
tions  cited  earlier,  and  it  opens  up  some  interesting  technical  problems 
which  we  will  discuss  and  solve.  Furthermore,  we  feel  that  our  results 
do  shed  some  light  on  the  issues  involved  in  assimilating  spatially- 
distributed  data  and  combining  regionally-processed  information  and  as 
such  represent  a  modest  first  step  towards  the  goal  of  solving  less 
restricted  versions  of  these  problems. 
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Recall  that  in  the  preceding  section  we  solved  the  causal  —  i.e. 
filtering  —  version  of  problems  of  combining  and  updating  estimates . 

We  found  that  a  solution  existed  with  effectively  no  restrictions  on  the 
relationship  between  the  local  and  global  models  (except  the  existence  of 
in  (2.14)).  In  this  section  we  are  interested  in  the  noncausal  versions 
of  this  problem.  As  we  will  see,  the  updating  problem  always  has  a  solution, 
while  the  combining  problem  can  be  solved  only  when  some  further  restriction, 
which  also  is  not  particularly  severe,  is  placed  on  the  local  models. 

In  the  next  subsection  we  will  develop  the  basic  ideas  behind  our  approach 
and  will  pointout  where  the  difficulty  arises.  In  the  following  two  sub¬ 
sections  we  will  address  the  two  special  cases  considered  in  Subsections 
2.2  and  2.3,  which  are  the  most  important  for  random  field  mapping,  and  we 
will  see  that  the  difficulty  can  be  overcome  in  these  cases. 

3.1  The  General  Case 

The  starting  point  for  our  analysis  is  the  two-filter  form  for  the 
optimal  smoother.  In  particular,  we  will  follow  the  approach  described  in 
[4] .  In  this  approach  the  smoothed  estimate  is  a  weighted  combination  of 
a  forward  estimate,  produced  by  the  usual  Kalman  filter,  and  a  reversed  es¬ 
timate,  produced  by  a  Kalman  filter  based  on  a  reversed-time  Markov  model. 
This  approach  has  the  advantage  of  not  involving  infinite  initial  error  co¬ 


variances.  For  all  of  this  we  assume  that  x(0)  is  zero  mean  and  that  the  local 
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processor  filters  (forward  and  reverse)  are  initialized  at  zero.  Nonzero 
initial  conditions  can  easily  be  accommodated,  because  of  linearity,  and  one 
should  then  think  of  all  of  the  variables  in  our  formulation  as  describing 
deviations  from  the  a  priori  mean. 

Let  us  summarize  the  smoother  equations  for  each  of  the  two  local 
processors.  For  ease  of  reference,  all  of  the  relevant  equations  are 
collected  here.  The  forward  estimator  for  processor  i  (i=l,2)  is  given 
by 


/\  —  1  ^  I  —  1 

X.^  =  [A. -P. _H. R.  H. ]x.  +  P..H.R.  y. 

if  l  if  l  1  1  if  if  1  1  1 


(3.3) 


where  cam  be  precomputed  from  either  of  the  equations 


-1. 


P.,  =  A.  P .  +  p.£a!  +  Q.  -  P.-H’r.  H.P.. 

if  l  if  if  i  *i  if  i  i  1  if 


(3.4) 


JT  (P?b  =  -pT^a.  -  a!p7*  -  pT^Q.pT^  +  h!r7V 

dt  if  if  1  l  if  if  i  if  ill 


(3.5) 


Note  that  these  equations  are  essentially  the  same  as  (2.10)- (2.12) . 

The  reverse  time  estimator  involves  the  unconditional  covariance  for 
the  local  model  assumed  by  the  processor,  which  can  be  calculated  from 


y.  =  A.y.  +  y.A!  +  q. 

‘'I  1^1  — 1  1  1 


or 


(3.6) 


(3.7) 


The  reverse-time  estimator  operates  backward  in  time  from  t=T  and  is 


given  by 


A  n-  1  I  -1  A  |  —  1 

-X.  =  t-A.-Q.y.  -  P.  H.R.  H. ]x.  +  P.  H . R  y . 

ir  i  ir  i  i  l  ir  ir  l  i  i 


(3.8) 


which  is  a  backward  Kalman  filter,  with  covariance  also  calculated 

backward  in  time  (with  initial  condition  P.  (T)  =  7. (T) )  from  either  of 

ir  *-i 

the  following  equations 


-P.  =  -[A.+Q.r1]P.  -  P.  [A.+Q.yT1]' 

ir  i  *1^1  ir  ir  l  ^i^i 


+  Q.  -  P.  H.'rT'Vp. 

ir  i  i  i  ir 


(3.9) 


f-(pT1)  =  pt1[a.+q.  y.1]  +  lA.+Q.y.1)^:1 

dt  xr  xr  l  *1^1  1  *i^i  ir 


-pT^.p?1  +  H.rTV 

ir  1  xr  1x1 


The  smoothed  estimate  is  then  given  by 


(3.10) 


where 


S\  —  2_/\  “1^ 

X.  =  P.  [P.  „X.  _  +  P.  X.  ] 
is  is  if  if  ir  ir 


.t1  -  »:}  *  p:1  - 1;1 

is  if  ir  H 


(3.11) 


(3.12) 


Note  that  (3.12)  again  reflects  the  fact  that  x . ,  and  x.  are  not 

if  ir 

independent  estimates,  as  they  both  utilize  a  priori  information, 
in  this  case  the  information  concerning  x(0).  Here  (3.12)  holds  for 
any  value  of 
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The  overall  smoothed  estimate  satisfies  a  similar  set  of  equations 


A  “1a  -1a. 

xs  ■  VP£  *f  +  Pr  xr 

(3.13) 

p-1  ,  p-1  .  p-1  .  J'1 
s  f  r  L 

(3.14) 

l  =  Al  +  Ia'  +  Q 

(3.15) 

It  ( fi, . .  rw1  - 1\  r1 

(3.16) 

•  •  “1  • 

=  AP  +P _A*  +Q-PC  R  C,P  -P  C  R  CP, 
f  ff  flllff222f 

(3.17) 

k  (pf1>  • 

-Pr  =  -tA+Qr1]Pr  -  PrtA+Q^  1]  +  Q 

(3.18) 

-PrCiRilciPr  -PrC2R2_lc2Pr 

(3.19) 

It  (Pr1)  =  PrltA+C^-:Ll  +  [A+Q^  1]'Pr1  ' 

+  ciRilci  +  c2r21c2 

(3.20) 

Using  the  results  of  the  previous  section  we  can  calculate  in 
terms  of  xlf  and  x2f,  and,  by  looking  at  the  problem  in  reverse  time, 

A  A  A 

we  can  use  the  same  result  to  compute  x^  in  terms  of  x^r  and  x^.  T"e 
resulting  equations  are 


(3.21) 


xf  =  h  +  6i«*if  +  G2f“2f 


Ff^f  +  ^f^lf  +  K2f*2f 

(3.22) 

Ff  =  A  -  -  PfC2R2  C2 

(3.23) 

Ci£  ‘  Wif1 

Kif  ■  ipfMipl£sipIf  - 

(3.24) 

.  [P^P^  -  PfA'MlP’Jl  -  pf»;p:p 

reverse  time 

A  A  A 

(3.25) 

x  =  i  +  G,  x,  +  G.  x_ 
r  r  lr  lr  2r  2r 

(3.26) 

^r  =  Fr^r  +  KlrXlr  +  K2rX2r 

(3.27) 

Fr  '  -»-8r1-Prc;Pi1VPrC^S 

(3.28) 

G.  =  P  m'.pT1 
ir  r  r  lr 

k.  =  (p  mIpT1  q.pT1  -  qm'pT1] 

ir  r  l  ir  *1  ir  ir 

(3.29) 

+  {PrMi['Ai"SlQilPir  '  +  Pr**iPir  (3'30) 


From  (3.13) , (3.21) , (3.22) ,  (3.26) ,  and  (3.27),  we  now  have  am 
algorithm  for  calculating  xg  from  x^f,  Xlr'  X2f'  an<*  X2r"  ^at  we  wou^ 
like  is  to  compute  xg  in  terms  of  x^g  and  x2g.  To  see  when  ^ow 
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The  last  two  terms  on  the  right-hand  side  represent  the  type  of 
combination  of  estimates  one  would  expect  if  the  two  sets  of  measurements 
had  independent  sources  of  error.  However,  as  we  have  seen,  they  are 
correlated,  and  thus  we  have  the  correction  terms  to  account  for  this 
correlation. 

We  have  now  reduced  the  algorithm  for  calculating  xg  to  equations 

(3.31),  (3.22),  and  (3.27).  We  have  eliminated  x._  and  x.  and 

if  ir 

replaced  then  with  x^g  in  (3.31),  but  (3.22)  and  (3.27)  still  involve  the 
forward  and  reverse  estimates.  Our  goal  is  to  try  to  perform  a  combination 
of  terms,  as  was  done  to  obtain  (3.31),  in  order  to  replace  these 

A 

estimates  with  x^g.  However,  we  cannot  perform  this  in  the  same  simple 
manner  as  was  used  earlier  at  least  for  the  equations  we  have  here.  For 

A  A 

example,  (3.22)  involves  x_  but  not  x.  ,  so  we  cannot  combine  terms  to 

if  -  ir 

A 

obtain  x^g.  Rather,  we  have  something  that  more  closely  resembles  an 

/v  t 

inverse  system  problem:  we  want  to  express  the  term  involving  x^f  in  (3.22) 
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by  a  team  involving  x^.  As  we  will  see  in  the  next  few  subsections 
this  cannot  always  be  done,  but  there  are  some  very  important  cases  in 
which  it  can  be  done.  The  most  basic  of  these  is  considered  in  the 
following  subsection. 


This  might  correspond,  for  example,  to  two  separate  measurements  along 
the  same  sets  of  one-dimensional  tracks  or  of  maps  of  the  same  region 
in  the  two  dimensional  field  produced  from  measurements  along  two 
different  sets  of  tracks. 

For  this  case  we  obtain  some  simplification,  as  we  did  in  the 
decentralized  filtering  problem.  Here 


(3.33) 

(3.34) 

(3.35) 

(3.36) 


(3.37) 
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Thus 


Z.  =  -(A'+y  Q-P.  Q)z.  -  pTAfiB., 
is  L  ir*  is  is  if 


z.  =  -(A'+P.^Q)z.  +  P.^Qz. 

is  if  is  is^ir 


Qz..  =  P.  [-z.  -(A'+Y  Q-p7q)z.  ] 

*  i  f  is  is  “  *  ir*  is 


(3.44) 

(3.45) 


ir~  is 


Qz.  =  P.  [z.  +  (A'  +P .  Q)  z .  ] 

*  ir  is  is  if*  is 


(3.46) 


(3.47) 


From  (3. 34)- (3. 38) ,  we  see  that  we  can  use  (3.46),  (3.47)  in  these 

equations  to  replace  x..,  x.  with  x.  .  Thus  in  this  case,  we  can  obtain 

if  ir  is  - 

an  algorithm  of  the  desired  form.  Note  that  we  haven’t  shown  that  we  can 

A  A  A 

recover  x..,  x.  from  x.  ,  or  equivalently  2  z,  from  z,  ,  but  we  have 
if  ir  is  ^  1  if'  ir  is 

seen  that  we  can  recover  Qz..  and  Qz .  ,  and  this  is  all  that  we  need  for 

*  if  *  ir 

our  problem.  Note,  however,  that  the  expressions  (3.46),  (3.47)  for 
these  quantities  involve  derivatives  of  z^.  In  order  to  avoid  these, 
we  must  use  a  feedforward  formulation.  First  of  all,  substituting  (3.36), 
(3.46)  into  (3.34)  we  obtain 


ir  is 


£f  =  Vf "  2  [Vif  -Ilpis[lis  +(a’+£" 

i=l 

From  (3.38)  we  know  that 


(3.48) 


or 


/V  ■  — 1  A  * 

X.  =  P.  P.  X.  +  P.  z . 
IS  IS  IS  IS  IS  IS 


•  A  .  —  1  ^ 

P.  Z  .  =  X.  -  P.  P.  X. 
IS  IS  IS  IS  IS  IS 


(3.49) 


(3.50) 
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Now  define 


qf  =  £f  +  I  [PfP-f  - 
1=1 


Differentiating  this,  using  (3.50),  we  obtain 
2  2 

q-  =  F,[q,  -  l  (PpT  -I)x.  ]  +  V  [  f-  (pX:)]x. 
f  f  f  ■  ,  F  if  is  . dt  f  if  is 
1=1  1=1 


n  —  1  •  —  1  r-.”  1  -1  —  1  A 

-  Z  [p*p-c  -I] [-P.  P.  +  P.  (A'+)  Q-P.  Q) P ,  ]x. 

*-  f  if  1  G  1C  1  G  t-  *  n  V-  1  G  1 


i=l 


is  is 


1-  -T^jpT1)  . 

ir  is  is 


(3.51) 


(3.52) 


We  have  all  of  the  equations  needed  to  simplify  this  equation 
except  for  an  expression  for  P^g.  From  (3.12) 


yr  (p~1)  ~  Jr  (p”*>  +  Jr  (p~1)  "  Jr  (  l> 

dt  is  dt  if  dt  ir  *+  ** 


dt 


(3.53) 


and  using  expressions  for  the  terms  on  the  right-hand  side  (equations 
(3.5),  (3.10)  and  (3.16)  together  with  (3.32),  we  obtain 


(pt1)  =  -p-1a  -  A’p?1  -  pt^qp.1  -  pT^r1  +  ptV:1 

<lt  is  T  1  <=  if  15  =  ^  15  V 


IS 


IS 


■if*  is 


is  ir 


(3.54) 


A  great  deal  of  algebra  then  yields 


q.  =  F  _q,  -  P  _C '  R  ^"C .  x  -  P„  C '  R,  ^C,x„ 
^f  f*f  f  2  2  2  Is  fll  12s 


(3.55) 


Note  that  "2"  -subscripted  matrices  multiply  x^g  and  "1"  -subscripted 

A 

matrices  multiply  x2g. 

We  can  follow  exactly  the  same  ideas  for  the  reverse  filter.  Let 


q  =  £  +  I  [P  P. 1  -I]x. 

r  ^r  . r  ir  i 
i=l 


is 


(3.56) 
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-q  =  F  q  -  P  C  R  C  x  -  P  C,  R.  C,  x 
r  r  r  r  2  2  2  Is  r  1  1  12s 


(3.57) 


Substituting  (3.51)  and  (3.56)  into  (3.33),  we  obtain 

A  |  -1  I  -1  -1  A  -1 

x„  =  po  { pf  'ic  '  1  (p._  -  p.  lx.  +  p  q 

s  2  I  f  f  . if  f  is  r  r 
'  i=l 

-  I  [P.1  -  P_1]x.  +  l  P^x.  \ 

u  ir  r  is  u  ioie\ 


l  P.  x.  \ 
i=l  18  13  f 


(3.58) 


Using  (3.12)  and  (3.14),  we  obtain  the  following  algorithm  for 
combining  smoothed  estimates 


(3.59) 

(3.60) 

(3.61) 


If  we  think  of  optimal  estimates  as  orthogonal  projections  in  spaces 

/\ 

of  random  variables,  then  x  is  the  projection  of  x  onto  V  ,  the  subspace 

IS  X 

A 

spanned  by  the  first  pass  measurements.  Similarly  x  is  the  projection 

^s 

onto  and  xg  is  the  projection  onto  +  V If  and  were 

/V  A  /V 

orthogonal,  i.e.,  independent,  then  x  would  equal  x  +  x  .  However, 

S  IS  4  s 

they  are  not,  and  thus  the  other  terms  in  (3.59)  account  for  this. 

We  can  actually  see  this  point  more  clearly  if  we  look  at  the  smoothing 
update  problem,  that  is,  the  problem  of  computing  xg  in  terms  of  the 
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A 

time  history  of  the  old  smoothed  estimate  x,  and  the  new  data  Y^ . 

The  solution  to  this  problem  is  readily  obtained  in  a  manner  analogous 
to  that  used  in  deriving  (2.32).  That  is,  if  we  perform  all  of  the 
analysis  we  have  just  done,  leaving  y^  alone  and  only  replacing  y^  by 

A  A  A  < 

x  ,  and  eventually  by  x^s ,  linearity  guarantees  that  the  input- 

A  A 

output  relation  from  x  to  x^  is  the  same  as  that  obtained  already. 

Thus,  all  the  work  we  need  to  do  is  already  done,  and  we  can  simply 
write  down  the  solution  to  the  updating  problem: 

(3.62) 

(3.63) 

(3.64) 

Here,  if  we  let  F  denote  the  orthogonal  complement  of  in  —  i *e • 

the  part  of  Y^  that  is  independent  of  Y^,  then  Y  ^+Y  =  ^1  ©  ^ >  an<^  x^s 
is  the  projection  of  x  onto  while  the  remaining  terms  in  (3.62)  are 
the  projection  onto  F. 

Note  also  that  (3.63),  (3.64)  can  be  rewritten  in  the  following 


form 


(3.65) 


(3.66) 


In  the  preceding  subsection  we  saw  that  the  smoothed  estimate 
updating  problem  could  be  solved  when  the  local  and  global  models  are 
identical.  In  this  section  we  look  at  the  problem  when  this  is  not  the 
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case.  Recall  that  the  general  algorithm  had  been  reduced  to  (3.22), 

(3.27),  and  (3.31),  together  with  (3.24)  and  (3.29)  which  define  the 

gains  needed  in  the  algorithm.  Also,  we  have  equations  for  z  ,  z  , 

if  ir 

and  z^s  as  before,  but  with  slight  modifications  due  tc  the  fact  that 
we  have  local  models.  Specifically,  we  have 


z.  =  z  +  z. 
is  if  ir 


•  '  -1  .  -1 

z..  =  -(A.+P..Q.)z  +  H.R.  y. 

if  l  if  i  if  i  l  ^i 


*.  =  - (a’+£.  q.-p. Lq. )z.  -  hIr.S'. 

ir  i  ‘-j.  ir  i  ir  i  i  i 


which  lead  to  the  equations 

Q.z.f  =  P.  [-z.  -(A!+y  1Q.-pT1Q.)2.  1 

*i  if  is  is  i  L  i*i  iryi  is 


Q.z.  =  P.  [z.  + (a! +pT 1Q. ) z .  ] 

i  ir  is  is  i  iF*i  is 


(3.83) 

(3.84) 

(3.85) 

(3.86) 

(3.87) 


In  order  to  proceed  as  we  did  in  the  preceding  section  we  wish  to 

be  able  to  find  some  matrices  L.„,  L.  so  that 

if  ir 


K.  ,x .  =  L._Q.z.,  =  I*.  ,Q. pT^x. 

if  if  if  i  if  if  i  if  if 

K.  x.  =  L.  Q.Z.  =  L.  Q.P_1X. 
ir  ir  ir‘1  ir  in  ir  ir 


This  will  be  possible  if  and  only  if 


(3.88) 

(3.89) 


N(QiPif)  -  N(Kif} 
N(QiPir) ^  N(Kir5 


(3.90a) 

(3.90b) 
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Note  that  if  (3.90a)  and  (3.90b)  hold,  then  L._  and  L.  can  be  chosen 

if  lr 


to  be 


L.,  =  K..P..Q. 
if  if  if  i 


L.  =  K.  P.  Q. 
ir  lr  irxi 


In  this  case,  we  can  substitute  in  for  Q.z.„  and  Q.z.  and  O.z.  using 

*i  if  l  ir  l  ir 

(3.86)  and  (3.87)  and  then  use  a  set  of  steps  similar  to  those  used  in 
Subsection  3.2  to  remove  the  z  -term.  Specifically,  substituting 

(3 . 86)  - (3 . 89)  into  (3.22)  and  (3.27)  we  obtain 


2  •  -i 

L  =  FE  -  7  L..P.  [z.  +  (A!  +  y.XQ.  -  P_1Q . )z .  ] 

f  f  f  if  is  is  i  ‘■i  *i  ir  i  is 

i=l 


"I  -1. 


-5  =  F  5  +  y  L.  P.  [z.  +  (A!  +  P~^Q.  )z .  ] 

r  r  r  . ir  is  is  i  if  l  is 

i=l 


Then,  using  (3.38)  and  (3.50), 


=  F-£_  +  y  L.-tP.  P  X.  -  X.  -  P.  (A!+y.  Q.-pT  Q.)P.  x.  ] 
f  f  f  . if  is  is  is  is  is  i  Lx  l  ir  i  is  is 
i=l 


•  n  A  •  -]^A  “1  “1a 

■E  =  YX*  +  y  L.  [x.  -  P.  P.  X.  +  P.  (A!  +P .  Q .  )  P .  X.  ] 

sr  f sf  . L ,  ir  is  is  is  is  is  l  if  i  is  is 

1=1 


Defining 


q  =  £,  +  y  L. .x. 
f  f  . if  is 
i=l 


1=5+  [  L. 

r  r  . L  l 


i=l 


x. 
ir  is 


* 
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we  find  that 

(3.91a) 

(3.91b) 

(3.91c) 

where 


’f  *  Vt  +  Vis  +  “2f*2s 


-q  =  F  q  +  N  X  +  N„  X^ 

1  r  r  r  lr  Is  2r  2s 


xo  =  P  [pjq,  +  P  1q  +  D  x  +  D„x„  ] 
s  s  f  f  r  r  1  Is  2  2s 


N.-  =  “F  _L  +  L  +  L  P.  PT1  -  L  P.  (A:+y.1Q.-pTJ'Q.  )PT 
if  f  if  if  if  is  is  if  is  i  Li  i  ir*i  i 


-i-  ..-i 

is 


N.  =  -F  L.  -  L.  -  L.  P.  pT1  +  L.  P.  (A!+P-10  )P_1 
ir  r  ir  lr  lr  is  is  ir  is  i  if*i  is 


D.  =  mIpT1  -  pT"Lb.  -  P_1L. 
i  i  is  f  if  r  ir 


(3.92a) 


(3.92b) 

(3.93) 


The  analysis  in  this  case  is  clearly  more  complex,  since  the  equations 
involve  and  Q^(  and  if  these  are  time-varying,  we  will  also  have  to 
consider  their  time  derivatives. 

Note  that  one  obvious  case  in  which  (3.90a)  and  (3.90b)  hold  is 
when  is  invertible.  In  this  case  the  smoother  is,  in  fact,  invertible, 
as  we  can  recover  z ^  and  z  and  consequently  the  original  data  y^, 
since  the  forward  and  reverse  Kalman  filters  are  always  invertible.  In 
the  remainder  of  this  section  we  wish  to  consider  one  other  important 
special  case  from  which  we  can  gain  more  insight  into  the  nature  of  our 
solution . 

Specifically  we  wish  to  consider  the  case  in  which  x^  is  an  actual 


part  of  the  global  state.  As  was  discussed  in  Subsection  2.3,  in  this 


case  (assuming  that 
basis  for  the  state 
components  of  x 


Also,  using  (2.57), 

/[(Pf> 

Klf  =( 

\ 
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is  onto)  we  can  choose  a  (possibly  time-varying ) 
space  so  that  wo  can  identify  x^  with  some  of  the 


equation  (3.25)  becomes 


llPlf  “  I1ClPlf 
12PlfQl  '  Q12lPlf 


(3.94) 


(3.95) 


(3.96) 


(3.97) 


(3.98) 


where 


If  we  write 


then 


(3.100) 


Etw^tjw^x)]  =  Q16(t-x)  (3.101) 

E[w2 (t)w^(x) ]  =  Q|26(t-x)  (3.102) 

From  this  it  is  relatively  easy  to  see  that  there  must  be  a  matrix 
so  that 

Q12  =  T1Q1  (3.103) 


Specifically,  if  we  write  w2  as  w^+w^,  where  w2i  is  the  besb  estimate 
of  w2  given  w^,  then  is  a  linear  function  of  w^,  say  w2j=T^w^,  and 
E[w  <t)w'(x)]-0.  This,  together  with  (3.101)  and  (3.102)  yields  (3.103). 
If  we  substitute  (3.103)  into  (3.98),  we  find  that 


Klf  = 


/(Pf) llPlf 
\ (pf)12plf 


(3.104) 


There.,  ^re. 


L 


If 


(,f>iipu  -  1  \ 

\! 


(3.105) 


We  can  perform  a  similar  analysis  in  reverse  time,  but  the  situation 
is  a  bit  more  complex.  We  will  comment  on  the  reasons  for  this  complication 
shortly,  but  first  we  will  present  the  solution.  For  the  special  case 
described  by  (3.94)- (3.96) ,  equation  (3.30)  reduces  to 


Kir  =  [PrMlPlrMl  *  + 


(3.106) 


where  M^d  .*  0]  ,  q  is  given  by  (3.96),  and  £  and  V 

l 


are  related  by 


l 


(3.107) 


and,  using  a  basic  formula  for  matrix  inverses  for  block  matrices  (see, 
for  example,  [5,  p.495] ) 


^1  +  li  l12  l22~l12  li  l12  I12^i 


-1  ,  -1 


i  -i'  r\ 

22  12^1  L. 


12 


-1 


l  l 

L\2  L1 


-h  lAk2~W12 

w:X 


-i 


(3.108) 


Using  these  relationships  and  also  (3.103),  equation  (3.106)  becomes 

^‘VlAr  - 
"Vu’u  -  W* 

C  ^  fc,  -  X  C  l 

s*  -  c  c  a1 


l  l1  -  T 

12  1  X 


Q.P?1 
1  lr 


(3.109) 
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Therefore 


Llr  = 


(p  )..p71  -  i 

r  11  lr 


-  Tu 


/< p  ),S1i  -(p  i  \ 

r  llLi  Li2  r  12 


VVlS^j  ^12  '“'r,22  I 


k-kcU'lo;1  - 


=  p 


r1 

lr 


1+  p 


,-1 


r1  / 1 


V  1 1 

o 


(3.110) 


Comparing  (3.109)  and  (3.110)  with  (3.104)  and  (3.105),  we  see  that 
the  first  terms  on  the  right-hand  sides  of  (3.109)  and  (3.110)  are 
analogous  to  the  right-hand  sides  of  (3.104)  and  (3.105),  respectively. 
The  additional  terms  in  (3.109)  and  (3.110)  represent  the  complication 
that  arises  in  constructing  reverse  models  for  by  itself  (as  used  in 

A  A 

the  calculation  of  and  x^g)  and  a  reverse  model  for  x  =  (x^,y  )*, 
which  is  used  in  computing  x^  and  xg.  Since  both  x^  and  x  are  Markov, 
we  can  in  fact  obtain  reverse-time  diffusion  models  for  each  of  these. 
Following  reference  [6]_ 


(3.111) 


(3.112) 


where  these  models  can  be  interpreted  as  generating  the  same  sample  paths 
as  the  forward  model.  Here  w^(t)  is  a  white  noise  process  with  strength 
Qj  and  it  represents  that  part  of  w^(t)  that  is  independent  of  the  future 
of  x^,  i.e.  x^(s),  s>t.  Similarly,  w(t)  is  a  white  noise  process  with 
strength  Q,  representing  the  part  of  w(t)  that  is  independent  of  x^(s),  s>t 
and  y(s) ,  s>t.  Because  of  this  difference  in  the  two  models. 


w  (t)  ^w^t)  (3.113) 

An  equivalent  way  of  looking  at  this  is  to  view  (3.111)  as  defining 
(with  w^O)  an  equation  for  the  best  "predictor",  going  in  reverse  time 
of  x^  given  the  future  of  x^.  Similarly  (3.112)  gives  the  best  predictor 
of  x^  and  y  given  the  future  of  x^  and  y.  Now  although  going  forward  in 
time  the  future  of  x^  is  decoupled  from  the  past  of  y,  the  future  of  y 
does  depend  on  the  past  of  x^  (see  equation  (3.95):  A^2=°'  ^ut  A21  nee^ 
not  be  zero).  Therefore,  if  we  want  to  predict  the  past  of  x^  the  future 
of  y  does  provide  us  with  information  (for  example,  the  future  history  of 
position  does  help  us  deduce  something  about  the  past  behavior  of  velocity) . 
For  this  reason,  although  the  (1,2)  block  of  the  forward- time  dynamics 
matrix  in  (3.95)  is  zero,  the  (1,2)  block  of  the  reverse-time  dynamics 
matrix  in  (3.112)  is  not  zero.  What  this  implies  is  that  in  reverse  time, 
x^  does  not  represent  the  state  of  a  subsystem  of  the  global  state,  and  the 


extra  terms  in  (3.109)  and  (3.110)  reflect  this  fact. 
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From  (3.103),  (3.108)  and  (3.112)  we  can  evaluate  the  (1,2)  term 
of  the  reverse  dynamics  matrix: 

♦  Ti]  (w  1 

Comparing  (3.109)  and  (3.114),  we  see  that  (assuming  the  invertibility  of 

P  ,  T  ,  and  J)  the  last  term  in  K,  will  be  zero  if  only  if  (3.114)  is 
r  L  lr 

zero,  that  is,  when  is  the  state  of  a  subsystem  both  in  forward  and 
reverse  time,  and  this  will  be  true  if  and  only  if 


or,  equivalently 


(3.115) 


I’  f1  Q,  -  Q1?  =  0  (3.116) 

It  is  relatively  easy  to  see  that  this  is  the  case  if  A21=®12=(^'  s^nce 
in  that  case  and  y  are  independent.  Other,  essentially  equally 
trivial  cases  can  be  found  in  which  (3.116)  is  satisfied,  but  the  con¬ 
dition  is  quite  restrictive. 

These  observation  notwithstanding,  (3.105)  and  (3.110)  allow  us  to 

A  A  A 

replace  x  and  x.  in  (3.22)  and  (3.27)  by  expressions  involving  x. 

it  ir  IS 

through  the  use  of  (3.91)- (3.93) .  A  similar  analysis  can  be  performed 
for  local  processor  #2  if  x^  is  the  state  of  a  subsystem  of  the  global 
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system  (of  course  the  change  of  basis  needed  on  the  global  state  space 
to  put  the  system  into  a  form  analogous  to  (3.95)  will  in  general  be 
different) .  Also,  we  can  consider  the  case  of  map  updating  in  which  we 

A  A 

wish  to  compute  x  from  x,  and  y„.  These  results  follow  in  much  the  same 
s  Is  2 

manner  as  those  derived  in  Section  3.2: 


'  +  A  rr  +  “As1 

(3.117) 

*  Vf  +  "lAs  +  vAA 

(3.118) 

‘  Frrr  +  "lAs  *  PrCAA 

(3.119) 

where  D^,  N^,  and  N  are  defined  in  (3.92)  and  (3.93).  These  equations 
hold  whenever  and  exist.  For  example,  if  x^  is  the  state  of  a 
subsystem  (forward  in  time) ,  then  (assuming  that  a  basis  has  been  chosen 
as  in  (3.95))  (3.92)  and  (3.93)  are  computed  using  M^fllO]  and  and 
defined  in  (3.105)  and  (3.110).  In  this  case,  some  algebra  yields 


■  (; ) 


Nlf  =  -PfC2 


+  (wrvJ 


(3.120) 


(3.121) 


N.  =  -P 
lr 


( Wr  Vi ) 


(3.122) 
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XV.  REAL-TIME  SMOOTHING 

A  variant  of  the  problem  addressed  in  the  preceding  section  is  the 
real-time  smoothing  problem.  In  this  case,  from  some  previous  pass  we 

A 

have  observed  y^(t)  over  the  interval  [0,T]  and  have  produced  x^it) . 

We  new  observe  y 2  up  to  time  t,  and  we  wish  to  compute  the  real-time 
smoothed  estimate  of  the  global  state,  i.e. 

xrs(t)  =  E[x(t)  lyx  (T)  ,  0<t<T,  y2(0),  0<o<t]  (4.1) 

in  terms  of  x^g  and  y^.  This  formulation  is  motivated  by  problems  in 

which  a  second  traversing  of  a  track  across  a  random  field  is  taken  in 

real  time  and  we  wish  to  process  the  data  as  we  get  it.  If  x  =x  then 

i  , 

the  tracks  are  identical.  If  x^  is  the  state  of  a  subsystem  of  the  global 
system,  i.e.  x'=(x^,y'),  then  there  are  two  possible  motivations.  The 
first  is  that  in  which  x  represents  several  tracks  across  the  field  and 
y2  may  be  data  from  one  of  these  other  tracks .  Alternatively ,  y  may 
represent  the  state  of  a  dynamic  system  which  is  affected  by  the  field, 
modeled  by  x^,  during  the  second  pass.  For  example  x^  might  represent 
anomalies  in  the  earth's  gravitational  field  and  y  could  represent  errors 
induced  in  an  inertial  navigation  system  aboard  a  ship  [2,  9].  In  this 
case  we  want  the  (real-time)  estimates  of  y.  Clearly  we  can  also  model 
in  this  same  way  the  case  in  which  y  contains  two  pieces,  one  of  which 
models  additional  tracks  and  the  other  models  the  state  of  a  dynamic  system 
affected  by  the  random  field  along  the  second  track. 

The  solution  to  this  problem  can  be  obtained  directly  from  the 
results  in  the  preceding  section.  Specifically,  at  any  time  t  we  can  view 
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(4.1)  as  performing  two  full  passes  over  [0,T] ,  but  at  any  time  t  we 
assume  that 

c2(s)  =  H2(s)=0  t<s<T  (4.2) 

A 

Also,  since  we  are  attempting  to  compute  x  directly  using  y2,  we  have 
essentially  a  smoothing  update  problem.  Based  on  these  observations  (3.31), 
(3.22),  and  (3.27)  can  be  adapted  to  the  present  situation: 


~  -1  -1  .  -I-' 

x  =  P  [P  +  P.  L  +  M  P  x  ] 
rs  rs  f  ^*f  b  b  1  Is  Is 


h  •  rf5f +  KK5if +  v&X 


X  •  Fb«b  +  V: 


lb 


where  Pf,  P^g,  Ff,  and  Klf,  are  as  before, 
covariance  for  x  based  on  y^  alone : 


-Pb  =  -(A+Qj  L]Pb  -  Pb(A+Q^  1]'  +  Q  - 


(4.3) 

(4.4) 

(4.5) 


and  P,  is  the  reverse  error 

D 


V^iVb 


(4.7) 


AlSO 


-1 


p-1  =  p:1  +  p:1  -  i 

rs  f  b  z- 


Fb  -  "A  -  d"1  -  PbCiRIlcI 


K  =  [p.  m'pTVpT1  -  qm'p"1] 
lb  b  1  lr*l  lr  *  1  lr 


(4.8) 

(4.9) 

(4.10) 


(4.20) 


-50- 


This  is  not  surprising,  since  P^  is  the  error  covariance  for  the 

b 

estimate  of  x  based  on  the  future  of  v, ,  P,  is  the  covariance  of  the 

1  lr 

estimation  error  for  x^  based  on  the  future  of  y^,  and  in  this  case 
x=x^.  What  (4.19)  and  (4.20)  also  imply  is  that 

K,.=0  (4.21) 

lb 

which  in  turn  implies  that  9^=0.  Thus  there  is  no  backward  processing 
in  this  case.  Again  this  is  not  surprising,  since  the  future  data  at 
time  t  is  just  (y^(s),  s<t},  as  y^is) ,  s<t  has  not  yet  been  collected, 
and  since  x^=x ,  the  future  of  y^  has  already  been  processed  optimally  in 

/v 

producing  x^.  Also  for  this  case 


Klf  "  [P£Pii  -  11 «P 


-1 

If 


(4.22) 


and  the  real-time  smoothing  solution  is  recursive  and  is  given  by 


A  —  1  /\ 

x  =  P  P^  q_  +  x, 

rs  rs  f  f  Is 


qf  =  Ffqf  +  PfC’R;  [y2  -  C^] 


(4.23) 

(4.24) 


The  other  important  case  of  interest  is  that  in  which  x^  is  the 
state  of  a  subsystem  of  the  global  system.  Specifically,  assume  that 
(3.94)- (3.103)  hold.  In  this  case  (3.117)- (3.122)  can  be  adopted  to  yield 


PrsEl  (tJ 


(4.25) 
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“if  '  -P£C2R2IC2  T. 


WaV'A, 


(4.26) 


Nlb  = 


‘  'Tl+A21+A22TrTlAl 


(4.27) 


Equations  (4.13)- (4.15) ,  (4.25)- (4.27)  define  the  algorithm  for 
real-time  smoothing.  Note  that  the  new  data  (y^)  is  processed  only 
forward  in  time,  while  the  reverse  processing  could  be  precomputed,  since 

A 

it  only  involves  x^s .  The  interpretation  of  this  reverse  processing 
deserves  some  comment.  Of  course  it  is  zero  if  x^=x.  What  it  does 
represent  essentially  is  a  reconstruction  of  the  reverse  filtered  estimate 
of  x  based  only  on  y^,  given  the  reverse  filtered  estimate  x^  of  x^ 
based  on  y^.  This  is  very  much  like  what  was  discussed  in  Section  2.3 
when  one  wishes  to  reconstruct  the  unobservable  feedforward  part  of  x 
from  the  filtered  estimate  of  the  observable  part  x^.  However  there  is 
a  difference  because,  as  mentioned  in  Section  3.3,  x^  is  not  a  substate 
of  x  in  reverse  time.  If  it  were,  then  given  x^r#  the  top  block  of  N^ 
would  have  to  be  zero,  since  the  best  estimate  of  Xjbased  on  the  future 
of  y  would  have  to  be  x,  .  However  the  first  part  of  N„L  is  not  zero, 
reflecting  the  fact  that  the  reverse  dynamics  for  x^  alone  are  different 
from  those  when  x^  is  viewed  as  some  set  of  the  components  of  x.  In  the 
latter  case,  x^  is  not  a  Markov  process  in  reverse  time. 


i 
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V.  DISCUSSION 

In  this  paper  we  have  considered  the  problems  of  combining 
estimates  obtained  from  several  separate  data  sources  which  have  been 
processed  individually  and  of  updating  an  estimate  as  another  source  of 
information  becomes  available.  In  Section  II  we  examined  the  causal 
version  of  this  problem  and  have  obtained  a  solution  under  very  general 
conditions.  Basically,  the  only  restriction  on  the  local  processing  is 
that  the  model  on  which  it  is  based  have  as  many  degrees  of  freedom  as 
there  are  in  the  observations  that  are  to  be  processed  locally.  We 
discussed  the  potential  utility  of  these  results  for  distributed  imple¬ 
mentation  of  Kalman  filters  and  for  efficient  transmission  of  information 
from  local  processors  to  a  central  processing  facility. 

Several  directions  for  further  work  are  suggested  by  the  results  of 
Section  II.  The  first  is  in  decentralized  estimation.  Consider  the 
situation  in  which  the  local  models  x^  and  x^  represent  different  pieces 
of  the  state  x.  In  general  these  pieces  will  be  coupled,  although  the 
local  processors  assume  that  there  is  no  coupling.  Given  that  the  global 
processor  does  take  this  coupling  into  account,  is  there  an  efficient 
distributed  fashion  in  which  each  local  estimate  can  be  corrected  using 
the  estimate  produced  by  the  other  local  processor?  If  the  coupling 
between  x^  and  x is  weak,  is  there  some  asymptotic  description  of  this 
correction?  What  if  there  are  different  time  scales?  For  example,  suppose 
the  local  processors  estimate  fast  and  slow  states  but  all  that  is  wanted 
globally  is  an  estimate  of  the  slow  global  states.  The  results  in  [10-12] 
on  multiple  time  scale  estimation,  combined  with  our  framework  should 
provide  the  basis  for  a  solution  to  such  a  problem. 
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A  second  problem  suggested  by  Section  II  is  that  of  efficient  dis¬ 
tributed  implementation  of  Kalman  filters.  IWo  types  of  issues  enter 
here:  (1)  the  amount  of  computation  that  is  done  by  each  local  processor; 

and  (2)  the  efficient  transmission  of  information  to  the  central  processor. 

If  in  fact  the  only  issue  were  the  second  one,  then  the  answer  would  be 
that  each  processor  should  whiten  the  observed  data  y^  and  transmit  the 
result.  In  other  words,  each  local  processor  should  build  a  global 
Kalman  filter  and  transmit  the  resulting  innovations.  Remember  that  the 
local  Kalman  filter  innovations  will  not  be  white  because  of  discrepancies 
between  local  and  global  models.  Given  that  there  are  constraints  on  the 
amount  of  computation  that  can  be  performed  locally,  the  question  of  what 
to  transmit  is  a  complex  one.  Specifically,  given  communication  capacity 
and  local  computation  constraints  the  problem  becomes  one  of  what  local 
processing  and  subsequent  data  transmission  scheme  is  best  in  the  sense  of 
degrading  the  global  estimate  as  little  as  possible.  Our  results  may  provide 
one  perspective  from  which  we  can  make  inroads  into  this  very  difficult 
problem. 

In  Sections  III  and  IV  we  considered  noncausal  versions  of  the 
combining  and  update  problems.  These  results  are  of  potential  use  in  some 
mapping  problems.  In  addition,  they  raise  as  many  question  as  they  answer. 
Specifically,  the  noncausal  estimate  combining  problem  does  not  always 
have  a  solution.  The  reason  for  this  is  that  the  noncausal  local  processing 
may  lose  some  information  that  is  needed  for  the  global  processing.  We 
presented  several  important  cases  where  this  does  not  happen,  but  the  issue 
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remains  of  characterizing  precisely  what  information  from  the  raw  data 

y^(x),  0 <t<T ,  is  preserved  in  the  local  smoothed  estimate  history 

x.  (t) ,  0<t<T. 
is  - 

Beyond  this  there  remains  the  issue  of  interpreting  the  results  of 
Section  III  and  IV.  The  very  simple  form  of  the  solution  in  some  cases, 
such  as  in  (3.65)  and  (3.66)  suggests  that  there  must  be  a  simpler 
derivation  and  interpretation  of  our  results  than  the  one  we  have  given. 

For  example,  the  framework  of  scattering  theory  [13]  may  provide  the 
machinery  necessary  to  simplify  our  analysis  and  add  to  our  insight.  Also, 
as  suggested  in  the  text  reference  to  (3.65)  and  (3.66),  one  interpretation 
of  our  map  updating  results  is  that  the  second  pass  data  are  used  to 
estimate  the  map  errors  from  the  first  pass.  The  fact  that  we  have  been 
able  to  determine  how  this  can  be  done  using  two  recursive  systems  (one 
causal  and  one  anticausal)  suggests  that  this  second  pass  processing  is 
based  on  a  recursive  model  for  the  map  errors .  This  suggests  the  notion 
of  conditional  stochastic  realizations ,  which  at  this  time  remains  as  just 
a  notion.  The  development  of  substance  for  this  notion  map  provide  the 
basic  insight  needed  to  understand  our  results  from  first  principles. 

Finally,  there  is  the  extension  of  our  map  updating  formulation  to 
more  general  scenerios  (non-parallel  tracks ,  point  measurements  as  well  as 
tracks)  and  more  general  random  field  models.  As  we  have  discussed  in 
Section  III,  the  resulting  problems  will  probably  be  infinite  dimensional 
in  nature.  While  this  is  a  technical  difficulty,  it  need  not  be  a  conceptual 
one.  The  results  we  have  obtained  and  the  notion  put  forth  of  realizations 


for  map  error  fields  should  be  useful  in  general.  In  addition,  our 
results  should  directly  carry  over  to  discrete  2-D  fields,  in  which  case 
the  generalization  to  more  general  scenarios  need  not  be  as  difficult 
technically .  The  development  of  a  more  general  theory  for  the  efficient 
assimilation  of  spatially-distributed  data,  either  in  continuous-  or 
discrete-space,  is  an  extremely  important  problem  with  a  myriad  of  potential 
applications.  It  is  our  hope  and  feeling  that  our  results  have  provided 
some  concepts  that  can  be  useful  in  developing  that  theory. 


-56- 


REFERENCES 


1 •  W.H.  Ledsham  and  D.H.  Staelin,  "An  Extended  Kalman-Bucy  Filter  for 

Atmospheric  Temperature  Profile  Retrieval  with  a  Passive  Microwave 
Sounder,"  J.  Applied  Meteorology,  Vol.  17,  No.  7,  July  1978, 
pp.  1023-1033. 

2.  R.A.  Nash,  Jr.  and  S.K.  Jordan,  "Statistical  Geodosy  —  An  Engineering 
Perspective,"  Proc.  IEEE,  Vol.  66,  No.  5,  May  1978,  pp.  532-550 

3.  J.L.  Speyer,  "Computation  and  Transmission  Requirements  for  a  Decen¬ 
tralized  Linear-Quadratic-Gaussian  Control  Problem,"  IEEE  Trans. 

Aut.  Control,  Vol.  AC-24,  No.  2,  April  1979,  pp.  266-269 

4.  J.E.  Wall,  Jr.,  A.S.  Willsky,  and  N.R.  Sandell,Jr. ,  "On  the  Fixed- 
Interval  Smoothing  Problem,"  submitted  to  Stochastics . 

5.  F.C.  Schweppe,  Uncertain  Dynamic  Systems,  Prentice-Hall,  Inc.  Englewood 
Cliffs,  N.J.,  1973. 

6.  G. Verghese  and  T.  Kailath,  "A  Further  Note  on  Backwards  Markovian  Models," 
IEEE  Trans.  Inf.  Theory,  Vol.  IT-25,  No.  1,  Jan.  1979,  pp.  121-124. 

7.  C.Y.  Chong,  "Hierarchical  Estimation,"  presented  at  the  Second  MIT/ONR 
Workshop  on  Distributed  Information  and  Decision  Systems  Motivated 

by  Naval  Command-Control-Communication  (C^)  Problems,  Naval  Postgraduate 
School,  Monterey,  Calif.,  July  1979. 

8.  s.  Attasi,  "Modelling  and  Recursive  Estimation  for  Double  Indexed  Sequences," 
in  System  Identification:  Advances  and  Case  Studies,  edited  by  R.K.  Mehra 
and  D.G.  Lainiotis,  Academic  Press,  New  York,  1976. 

9.  A.S.  Willsky  and  N.R.  Sandell,  "The  Stochastic  Analysis  of  Dynamic  Systems 
Moving  Through  Random  Fields,"  to  be  submitted  for  publication. 

10.  A.H.  Haddad,  "Linear  Filtering  of  Singularly  Perturbed  Systems,"  IEEE 
Trans .  Aut .  Control ,  Vol.  AC-21,  No.  4,  Aug.  1976,  pp.  515-519. 

11.  D.  Teneketzis  and  N.R.  Sandell,  Jr.,  "Linear  Regulator  Design  for  Stochastic 
Systems  by  a  Multiple  Time  Scales  Method,"  IEEE  Trans .  Aut .  Contr . , 

Vol.  AC-22 ,  No.  4,  1977.  " 

12.  C.E.  Hutchinson  and  N.A.  Acharva,  "Singularly  Perturbed  Estimation  Using 
the  Method  of  Multiple  Time  Scales,"  Proc.  1979  JACC,  Denver,  Colo., 

pp.  629-633. 


L.  Ljung,  T.  Kailath  and  B.  Friedlander,  "Scattering  Theory  and  Linear 
Least  Squares  Estimation  -  Part  I:  Continuous-Time  Problems,"  proc. 
IEEE,  Vol.  64,  Jan.  1976,  pp.  131-139. 


R.F.  Brammer,  "Estimation  of  the  Ocean  Geoid  near  the  Blake  Escarpment 
Using  GEOS- 3  Satellite  Altimetry  Data,"  submitted  to  J.  Geophys.  Research. 


I 


